Added a draft for LAI #4771
Please do not merge yet. I need to add test data. The current genome files are either too small or have been soft-masked, whereas this tool works on the repeat space and requires an unmasked genome.
Converted to draft. When you're ready for review, click "Ready for review".
modules/nf-core/lai/main.nf
Outdated
sed \\
'/^>/ s/\\s.*\$//' \\
$fasta \\
> for_lai_no_comments.fsa
For reproducibility/output provenance, add this file as an optional output, and use an input boolean to say whether this file is kept or rm'ed in the script.
I have removed this logic entirely as it was only a partial solution. We need a separate module to format a file for LAI, because LAI also requires alphanumeric FASTA IDs of at most 13 characters. I have implemented one module which creates the ID mapping and a separate module which does the reverse mapping. I'll upload those as part of the FASTA_LTRRETRIEVER_LAI workflow.
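For reference, the header-stripping step discussed in this thread and the ID constraint mentioned in the reply can be sketched in plain shell. This is only an illustration, not the module's code: the input file `example.fsa` and its records are made up, `[[:space:]]` stands in for the `\s` shorthand used in the diff, and treating an underscore as non-alphanumeric is an assumption about how LAI interprets IDs.

```shell
# Hypothetical input: one header with a space-separated comment, one with a tab.
printf '>chr1 some description\nACGT\n>scaffold_0000000042\tnote\nGGCC\n' > example.fsa

# On header lines only, drop everything from the first whitespace onwards.
sed '/^>/ s/[[:space:]].*$//' example.fsa > for_lai_no_comments.fsa

# List IDs that are longer than 13 characters or not purely alphanumeric
# (assumed reading of the constraint stated in the reply above).
bad_ids=$(grep '^>' for_lai_no_comments.fsa \
    | sed 's/^>//' \
    | awk 'length($0) > 13 || $0 !~ /^[A-Za-z0-9]+$/')

echo "IDs needing a rename: ${bad_ids}"
```

Stripping comments alone leaves `scaffold_0000000042` untouched, which is why a dedicated renaming step is still needed for IDs like this one.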
modules/nf-core/lai/environment.yml
Outdated
- bioconda
- defaults
dependencies:
- "bioconda::LTR_retriever=2.9.0"
The placing of these modules is getting confusing. The package names and the tools inside them are not really findable with what we've been doing lately. I'm starting to think we need to enforce that the tool should be named after the package it's in followed by the tool name. I.e. in this case we get LTRRETRIEVER_LAI.
Yes, I agree. Let's move it to LTRRETRIEVER_LAI because historically LAI never had its own GitHub repo or a Bioconda recipe. It is a script which has its own publication but can only be applied to the outputs from LTRRETRIEVER.
Now the module is LTRRETRIEVER_LAI
modules/nf-core/lai/main.nf
Outdated
mv \\
$lai_output_name \\
"${prefix}.LAI.out" \\
|| echo "LAI did not produce the output file"
Suggested change:
- || echo "LAI did not produce the output file"
+ || echo "No LTR annotations were found by RepeatMasker."
I think it would be better to echo what the implication is. I'm not sure my suggestion is correct, but please state what no output implies.
I have updated this to echo "LAI failed to estimate assembly index. See ${prefix}.LAI.log"
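The `mv ... || echo ...` pattern discussed here can be tried in isolation. The sketch below uses made-up values for `prefix` and `lai_output_name` and a file that deliberately does not exist. It shows that the fallback message is printed and that the overall exit status is 0, so the shell itself does not report a failure when the output is missing.

```shell
prefix="sample"                        # hypothetical prefix
lai_output_name="missing_output.LAI"   # hypothetical name; the file is deliberately absent

# mv fails because the source file is missing, so the echo on the right-hand side runs.
msg=$(mv "$lai_output_name" "${prefix}.LAI.out" 2>/dev/null \
    || echo "LAI failed to estimate assembly index. See ${prefix}.LAI.log")
status=$?

echo "$msg"
echo "exit status: $status"
```

Because `echo` succeeds, the combined command succeeds, which is presumably the intent here: the message explains the missing file instead of aborting the script.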
modules/nf-core/lai/main.nf
Outdated
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
def monoploid_param = monoploid_seqs ? "-mono $monoploid_seqs" : ''
def lai_output_name = monoploid_seqs ? "${annotation_out}.${monoploid_seqs}.out.LAI" : "${annotation_out}.LAI"
When monoploid_seqs is [] this name is going to be ${annotation_out}..out.LAI (i.e. with the double .). Is that really the case when monoploid_seqs is not given, otherwise the mv will fail?
Good point. I need to add a test to verify this functionality.
Now I am testing this behaviour here
test("stub") {
and here
test("stub_with_monoploid_seqs") {
Thank you @mahesh-panchal
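The two naming branches questioned in this thread can be mirrored in a small shell function. This is a sketch only: the function `lai_output_name` and the file names `genome.out` and `mono.txt` are hypothetical, and an empty second argument stands in for monoploid_seqs being []. It assumes Groovy's usual truthiness rule that an empty list evaluates to false, in which case the ternary in the diff would take the second form rather than produce a double dot.

```shell
# Hypothetical helper mirroring the ternary in the diff above.
lai_output_name() {
    annotation_out="$1"
    monoploid_seqs="${2:-}"   # empty string plays the role of []

    if [ -n "$monoploid_seqs" ]; then
        printf '%s.%s.out.LAI\n' "$annotation_out" "$monoploid_seqs"
    else
        printf '%s.LAI\n' "$annotation_out"
    fi
}

lai_output_name "genome.out"              # no monoploid file given
lai_output_name "genome.out" "mono.txt"   # monoploid file given
```

Whether LAI itself writes files under exactly these names is what the two stub tests referenced above are meant to cover.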
I have added test data here: nf-core/test-datasets#1095. Kindly review the test data PR if possible. Many thanks!
* Added a draft for LAI * Added tests with larger local data * Added ltrharvest to lai test * Updated tests for lai * Removed unnecessary args
PR checklist
Closes #XXX
versions.yml
file.label
PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware